Estimating User Location in Social Media with Stacked Denoising Auto-encoders

نویسندگان

Ji Liu

Diana Inkpen

چکیده

Only very few users disclose their physical locations, which may be valuable and useful in applications such as marketing and security monitoring; in order to automatically detect their locations, many approaches have been proposed using various types of information, including the tweets posted by the users. It is not easy to infer the original locations from textual data, because text tends to be noisy, particularly in social media. Recently, deep learning techniques have been shown to reduce the error rate of many machine learning tasks, due to their ability to learn meaningful representations of input data. We investigate the potential of building a deep-learning architecture to infer the location of Twitter users based merely on their tweets. We find that stacked denoising auto-encoders are well suited for this task, with results comparable to state-of-the-art models.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Gradual training of deep denoising auto encoders

Stacked denoising auto encoders (DAEs) are well known to learn useful deep representations, which can be used to improve supervised training by initializing a deep network. We investigate a training scheme of a deep DAE, where DAE layers are gradually added and keep adapting as additional layers are added. We show that in the regime of mid-sized datasets, this gradual training provides a small ...

متن کامل

Predicting online user behaviour using deep learning algorithms

We propose a robust classifier to predict buying intentions based on user behaviour within a large e-commerce website. In this work we compare traditional machine learning techniques with the most advanced deep learning approaches. We show that both Deep Belief Networks and Stacked Denoising auto-Encoders achieved a substantial improvement by extracting features from high dimensional data durin...

متن کامل

Training Auto-encoders Effectively via Eliminating Task-irrelevant Input Variables

Auto-encoders are often used as building blocks of deep network classifier to learn feature extractors, but task-irrelevant information in the input data may lead to bad extractors and result in poor generalization performance of the network. In this paper,via dropping the task-irrelevant input variables the performance of auto-encoders can be obviously improved .Specifically, an importance-bas...

متن کامل

Japanese Sentiment Classification with Stacked Denoising Auto-Encoder using Distributed Word Representation

Traditional sentiment classification methods often require polarity dictionaries or crafted features to utilize machine learning. However, those approaches incur high costs in the making of dictionaries and/or features, which hinder generalization of tasks. Examples of these approaches include an approach that uses a polarity dictionary that cannot handle unknown or newly invented words and ano...

متن کامل